1.
Article in English | MEDLINE | ID: mdl-38478434

ABSTRACT

Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment. As a powerful AI strategy, deep learning techniques have extensively promoted the development of visual speech learning. Over the past five years, numerous deep learning based methods have been proposed to address various problems in this area, especially automatic visual speech recognition and generation. To push forward future research on visual speech, this paper presents a comprehensive review of recent progress in deep learning methods for visual speech analysis. We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance. We also identify gaps in current research and discuss promising future research directions.

2.
IEEE Trans Image Process ; 33: 957-971, 2024.
Article in English | MEDLINE | ID: mdl-38252569

ABSTRACT

Clustering is a fundamental and important step in many image processing tasks, such as face recognition and image segmentation. The performance of clustering can be largely enhanced if relevant weak supervision information is appropriately exploited. To achieve this goal, in this paper, we propose the Compound Weakly Supervised Clustering (CSWC) method. Concretely, CSWC incorporates two types of widely available and easily accessed weak supervision information, from the label and feature aspects, respectively. At the label level, pairwise constraints are utilized as a typical kind of weak label supervision information. At the feature level, the partial instances collected from multiple perspectives have internal consistency and are regarded as weak structure supervision information. To achieve a more confident clustering partition, we learn a unified graph with its similarity matrix to incorporate the above two types of weak supervision. On one hand, this similarity matrix is constructed by self-expression across the partial instances collected from multiple perspectives. On the other hand, the pairwise constraints, i.e., must-links and cannot-links, are considered by formulating a regularizer on the similarity matrix. Finally, the clustering results can be directly obtained from the learned graph, without performing additional clustering techniques. Besides evaluating CSWC on 7 benchmark datasets, we also apply it to face clustering in video data, given its broad application potential. Experimental results demonstrate the effectiveness of our algorithm in both incorporating compound weak supervision and identifying faces in real applications.
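As a rough illustration of how must-links and cannot-links can act as a regularizer on a similarity matrix, the sketch below scores a matrix against a set of constraints. The function name `constraint_penalty`, the squared-error form of the penalty, and the toy matrix are assumptions made for this example; the abstract does not give the article's actual regularizer.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def constraint_penalty(
    similarity: Sequence[Sequence[float]],
    must_links: List[Pair],
    cannot_links: List[Pair],
) -> float:
    """Hypothetical penalty: small when must-linked pairs have similarity
    near 1 and cannot-linked pairs have similarity near 0."""
    penalty = 0.0
    for i, j in must_links:
        penalty += (1.0 - similarity[i][j]) ** 2
    for i, j in cannot_links:
        penalty += similarity[i][j] ** 2
    return penalty


# Toy symmetric similarity matrix for three samples (values in [0, 1]).
S = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]

# Constraints that agree with S give a low penalty ...
good = constraint_penalty(S, must_links=[(0, 1)], cannot_links=[(0, 2)])
# ... and constraints that contradict S give a high one.
bad = constraint_penalty(S, must_links=[(0, 2)], cannot_links=[(0, 1)])
```

Adding such a term to a graph-learning objective would pull the learned similarities toward the supplied constraints; how strongly depends on a trade-off weight that is not shown here.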

3.
IEEE Trans Cybern ; 54(3): 1708-1721, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37027768

ABSTRACT

With the growth of data collection methods, data often have multiple modalities or come from multiple sources. Traditional multiview learning often assumes that each example of data appears in all views. However, this assumption is too strict in some real applications, such as multisensor surveillance systems, where every view suffers from some absent data. In this article, we focus on how to classify such incomplete multiview data in a semisupervised scenario and propose a method called absent multiview semisupervised classification (AMSC). Specifically, partial graph matrices are constructed independently by an anchor strategy to measure the relationships between each pair of present samples on each view. To obtain unambiguous classification results for all unlabeled data points, AMSC learns view-specific label matrices and a common label matrix simultaneously. AMSC measures the similarity between pairs of view-specific label vectors on each view by partial graph matrices, and considers the similarity between view-specific label vectors and class indicator vectors based on the common label matrix. To characterize the contributions of different views, the p-th root integration strategy is adopted to incorporate the losses of different views. By further analyzing the relation between the p-th root integration strategy and the exponential decay integration strategy, we develop an efficient algorithm with proved convergence to solve the proposed nonconvex problem. To validate the effectiveness of AMSC, comparisons are made with some benchmark methods on real-world datasets and in a document classification scenario as well. The experimental results demonstrate the advantages of our proposed approach.
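One common way to read a p-th root integration of per-view losses is sketched below. The notation is assumed for this note and is not taken from the article: $\mathcal{L}_v$ is the (positive) loss of view $v$, $V$ the number of views, $\theta$ the model parameters, and $\alpha_v$ the implied view weight. The article's exact objective may differ.

```latex
\min_{\theta} \; \sum_{v=1}^{V} \bigl(\mathcal{L}_v(\theta)\bigr)^{1/p}, \qquad p > 1,
\qquad
\nabla_{\theta} \sum_{v=1}^{V} \mathcal{L}_v^{1/p}
  = \sum_{v=1}^{V} \alpha_v \, \nabla_{\theta} \mathcal{L}_v,
\qquad
\alpha_v = \frac{1}{p} \, \mathcal{L}_v^{\frac{1}{p} - 1}.
```

Because $\tfrac{1}{p} - 1 < 0$ when $p > 1$, a view with a smaller loss receives a larger weight $\alpha_v$, so the root acts as an implicit, parameter-free weighting of views.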

4.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14789-14806, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37610915

ABSTRACT

With the emergence of new data collection methods in many dynamic environment applications, samples are gathered gradually in accumulated feature spaces. The incorporation of new types of features may also increase the number of classes. For instance, in activity recognition, using the old features during warm-up, we can separate different warm-up exercises. With the accumulation of new attributes obtained from newly added sensors, we can better separate the newly appeared formal exercises. Learning for such simultaneous augmentation of feature and class is crucial but rarely studied, particularly when the labeled samples with full observations are limited. In this paper, we tackle this problem by proposing a novel two-stage incremental learning method for Simultaneous Augmentation of Feature and Class (SAFC). To guarantee the reusability of the model trained on previous data, we add a regularizer to the current model, which provides a solid prior for training the new classifier. We also present theoretical analyses of the generalization bound, which validate the efficiency of model inheritance. After solving the one-shot problem, we extend it to the multi-shot case. Experimental results demonstrate the effectiveness of our approaches, including in activity recognition applications.

5.
IEEE Trans Image Process ; 32: 3702-3716, 2023.
Article in English | MEDLINE | ID: mdl-37405881

ABSTRACT

In image processing, images are often composed of partial views due to the uncertainty of collection. How to process such images efficiently, a problem known as incomplete multi-view learning, has attracted widespread attention. The incompleteness and diversity of multi-view data increase the difficulty of annotation, resulting in a divergence of label distribution between the training and testing data, known as label shift. However, existing incomplete multi-view methods generally assume that the label distribution is consistent and rarely consider the label shift scenario. To address this new but important challenge, we propose a novel framework termed Incomplete Multi-view Learning under Label Shift (IMLLS). In this framework, we first give the formal definitions of IMLLS and the bidirectional complete representation, which describes the intrinsic and common structure. Then, a multilayer perceptron that combines the reconstruction and classification losses is employed to learn the latent representation, whose existence, consistency and universality are proved with the theoretical satisfaction of the label shift assumption. After that, to align the label distribution, the learned representation and the trained source classifier are used to estimate the importance weight through a new estimation scheme that balances, in theory, the error generated by finite samples. Finally, the trained classifier, reweighted by the estimated weight, is fine-tuned to reduce the gap between the source and target representations. Extensive experimental results validate the effectiveness of our algorithm over existing state-of-the-art methods in various aspects, together with its effectiveness in discriminating schizophrenic patients from healthy controls.


Subject(s)
Algorithms , Learning , Humans , Image Processing, Computer-Assisted , Neural Networks, Computer , Uncertainty
6.
Brain Sci ; 13(5)2023 May 03.
Article in English | MEDLINE | ID: mdl-37239229

ABSTRACT

Dividing a pre-defined brain region into several heterogeneous subregions is crucial for understanding its functional segregation and integration. Due to the high dimensionality of brain functional features, clustering is often postponed until after dimensionality reduction in traditional parcellation frameworks. However, such stepwise parcellation easily falls into a local optimum, since the dimensionality reduction step cannot take the requirements of clustering into account. In this study, we developed a new parcellation framework based on discriminative embedded clustering (DEC), combining subspace learning and clustering in a common procedure, with alternating minimization adopted to approach the global optimum. We tested the proposed framework in functional connectivity-based parcellation of the hippocampus. The hippocampus was parcellated into three spatially coherent subregions along the anteroventral-posterodorsal axis; the three subregions exhibited distinct functional connectivity changes in taxi drivers relative to non-driver controls. Moreover, compared with traditional stepwise methods, the proposed DEC-based framework demonstrated higher parcellation consistency across different scans within individuals. The study proposed a new brain parcellation framework with joint dimensionality reduction and clustering; the findings might shed new light on the functional plasticity of hippocampal subregions related to long-term navigation experience.

7.
Article in English | MEDLINE | ID: mdl-37216237

ABSTRACT

The bagging method has received much attention and wide application in recent years due to its good performance and simple framework. It has facilitated the advanced random forest method and accuracy-diversity ensemble theory. Bagging is an ensemble method based on simple random sampling (SRS) with replacement. However, SRS is only the most basic sampling method in statistics, and more advanced sampling methods exist for probability density estimation. In imbalanced ensemble learning, down-sampling, over-sampling, and SMOTE methods have been proposed for generating base training sets. However, these methods aim at changing the underlying distribution of the data rather than simulating it better. The ranked set sampling (RSS) method uses auxiliary information to obtain more effective samples. The purpose of this article is to propose a bagging ensemble method based on RSS, which uses the ordering of objects related to the class to obtain more effective training sets. To explain its performance, we give a generalization bound of the ensemble from the perspective of posterior probability estimation and Fisher information. Because an RSS sample has higher Fisher information than an SRS sample, the presented bound theoretically explains the better performance of RSS-Bagging. Experiments on 12 benchmark datasets demonstrate that RSS-Bagging performs statistically better than SRS-Bagging when the base classifiers are multinomial logistic regression (MLR) and support vector machine (SVM).
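To make the sampling idea concrete, here is a minimal sketch of balanced ranked set sampling used to draw one bagging-style training set. It is an illustration under stated assumptions, not the article's RSS-Bagging procedure: the function name `ranked_set_sample`, the choice to draw each small set with replacement, and the `rank_key` auxiliary ranking function are all hypothetical.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def ranked_set_sample(
    population: Sequence[T],
    set_size: int,
    cycles: int,
    rank_key: Callable[[T], float],
    rng: random.Random,
) -> List[T]:
    """Draw set_size * cycles units by balanced ranked set sampling."""
    sample: List[T] = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw one small set by simple random sampling
            # (with replacement here, mirroring bagging; an assumption).
            candidates = [rng.choice(population) for _ in range(set_size)]
            # Rank the set using the auxiliary information only.
            candidates.sort(key=rank_key)
            # Keep the unit holding the current rank; discard the rest.
            sample.append(candidates[rank])
    return sample


# Toy population where the auxiliary ranking variable is the value itself.
population = list(range(100))
rng = random.Random(0)
training_set = ranked_set_sample(
    population, set_size=3, cycles=4, rank_key=lambda v: float(v), rng=rng
)
```

Each cycle contributes one unit per rank, so the resulting sample is spread across the ordered range of the auxiliary variable more evenly than a simple random sample of the same size would typically be. Repeating the call with different seeds would give the base training sets of an ensemble.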

8.
Article in English | MEDLINE | ID: mdl-37220049

ABSTRACT

In many real-world applications, data may dynamically expand over time in both volume and feature dimensions. Besides, they are often collected in batches (also called blocks). We refer to this kind of data, whose volume and features increase in blocks, as blocky trapezoidal data streams. Current works either assume that the feature space of data streams is fixed or stipulate that the algorithm receives only one instance at a time, and none of them can effectively handle blocky trapezoidal data streams. In this article, we propose a novel algorithm to learn a classification model from blocky trapezoidal data streams, called learning with incremental instances and features (IIF). We attempt to design highly dynamic model update strategies that can learn from increasing training data with an expanding feature space. Specifically, we first divide the data streams obtained in each round and construct the corresponding classifiers for these divided parts. Then, to realize effective interaction of information between the classifiers, we utilize a single global loss function to capture their relationship. Finally, we use the idea of ensemble learning to obtain the final classification model. Furthermore, to make this method more widely applicable, we directly transform it into a kernel method. Both theoretical analysis and empirical analysis validate the effectiveness of our algorithm.

9.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9306-9324, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37021891

ABSTRACT

In many dynamic environment applications, as data collection methods evolve, data attributes are incremental and samples are stored in gradually accumulated feature spaces. For instance, in the neuroimaging-based diagnosis of neuropsychiatric disorders, with the emergence of diverse testing methods, we get more brain image features over time. The accumulation of different types of features will unavoidably bring difficulties in manipulating the high-dimensional data. It is challenging to design an algorithm to select valuable features in this feature incremental scenario. To address this important but rarely studied problem, we propose a novel Adaptive Feature Selection method (AFS). It enables the reusability of the feature selection model trained on previous features and adapts it automatically to fit the feature selection requirements on all features. Besides, an ideal l0-norm sparse constraint for feature selection is imposed, together with an effective solving strategy. We present theoretical analyses of the generalization bound and convergence behavior. After tackling this problem in the one-shot case, we extend it to the multi-shot scenario. Extensive experimental results demonstrate the effectiveness of reusing previous features and the superiority of the l0-norm constraint in various aspects, together with its effectiveness in discriminating schizophrenic patients from healthy controls.


Subject(s)
Algorithms , Brain , Humans , Brain/diagnostic imaging , Neuroimaging
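The l0-norm constraint mentioned in the abstract of entry 9 limits how many features may carry a nonzero weight. A standard building block for working with such a constraint is hard thresholding, which keeps the k largest-magnitude weights and zeroes the rest. The sketch below shows only that generic operation; the function name `project_l0` and the toy weights are assumptions, and the article's own solving strategy is not described in the abstract.

```python
from typing import List


def project_l0(weights: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of weights and zero out the rest."""
    if k <= 0:
        return [0.0] * len(weights)
    # Indices ordered by decreasing absolute value; ties keep the earlier index.
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    keep = set(order[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


# Toy weight vector over five features (hypothetical values).
w = [0.3, -2.0, 0.05, 1.1, -0.4]
w_sparse = project_l0(w, k=2)
# Indices of the features that survive the constraint.
selected = [i for i, v in enumerate(w_sparse) if v != 0.0]
```

Unlike an l1 penalty, which shrinks all weights and selects features only indirectly, this operation fixes the number of selected features exactly at k.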
10.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 10427-10442, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37022260

ABSTRACT

Insufficient annotated data and minor lung lesions pose big challenges for computed tomography (CT)-aided automatic COVID-19 diagnosis at an early outbreak stage. To address this issue, we propose a Semi-Supervised Tri-Branch Network (SS-TBN). First, we develop a joint TBN model for dual-task application scenarios of image segmentation and classification, such as CT-based COVID-19 diagnosis, in which pixel-level lesion segmentation and slice-level infection classification branches are simultaneously trained via lesion attention, and an individual-level diagnosis branch aggregates slice-level outputs for COVID-19 screening. Second, we propose a novel hybrid semi-supervised learning method to make full use of unlabeled data, combining a new double-threshold pseudo labeling method specifically designed for the joint model and a new inter-slice consistency regularization method specifically tailored to CT images. Besides two publicly available external datasets, we collect internal and our own external datasets including 210,395 images (1,420 cases versus 498 controls) from ten hospitals. Experimental results show that the proposed method achieves state-of-the-art performance in COVID-19 classification with limited annotated data even if lesions are subtle, and that segmentation results promote interpretability for diagnosis, suggesting the potential of the SS-TBN for early screening when labeled data are insufficient at the early stage of a pandemic outbreak like COVID-19.


Subject(s)
COVID-19 , Humans , COVID-19 Testing , Algorithms , Supervised Machine Learning
11.
Cereb Cortex ; 33(7): 3575-3590, 2023 03 21.
Article in English | MEDLINE | ID: mdl-35965076

ABSTRACT

Brain cartography has expanded substantially over the past decade. In this regard, resting-state functional connectivity (FC) plays a key role in identifying the locations of putative functional borders. However, scant attention has been paid to the dynamic nature of functional interactions in the human brain. Indeed, FC is typically assumed to be stationary across time, which may obscure potential or subtle functional boundaries, particularly in regions with high flexibility and adaptability. In this study, we developed a dynamic FC (dFC)-based parcellation framework, established a new functional human brain atlas termed D-BFA (DFC-based Brain Functional Atlas), and verified its neurophysiological plausibility by stereo-EEG data. As the first dFC-based whole-brain atlas, the proposed D-BFA delineates finer functional boundaries that cannot be captured by static FC, and is further supported by good correspondence with cytoarchitectonic areas and task activation maps. Moreover, the D-BFA reveals the spatial distribution of dynamic variability across the brain and generates more homogenous parcels compared with most alternative parcellations. Our results demonstrate the superiority and practicability of dFC in brain parcellation, providing a new template to exploit brain topographic organization from a dynamic perspective. The D-BFA will be publicly available for download at https://github.com/sliderplm/D-BFA-618.


Subject(s)
Brain , Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Brain/diagnostic imaging , Brain/physiology , Brain Mapping/methods
12.
IEEE Trans Cybern ; 51(3): 1690-1703, 2021 Mar.
Article in English | MEDLINE | ID: mdl-31804950

ABSTRACT

In real-world applications, not all instances in multiview data are fully represented. Incomplete multiview learning (IML) has emerged to deal with such incomplete data. In this article, we propose the joint embedding learning and low-rank approximation (JELLA) framework for IML. The JELLA framework approximates the incomplete data by a set of low-rank matrices and learns a full and common embedding by linear transformation. Several existing IML methods can be unified as special cases of the framework. More interestingly, some linear transformation-based complete multiview methods can be adapted to IML directly with the guidance of the framework. Thus, the JELLA framework improves the efficiency of processing incomplete multiview data, and bridges the gap between complete multiview learning and IML. Moreover, the JELLA framework can provide guidance for developing new algorithms. For illustration, within the framework, we propose the IML with block-diagonal representation (IML-BDR) method. Assuming that the sampled examples have an approximate linear subspace structure, IML-BDR uses a block-diagonal structure prior to learn the full embedding, which would lead to more correct clustering. A convergent alternating iterative algorithm with the successive over-relaxation optimization technique is devised for optimization. The experimental results on various datasets demonstrate the effectiveness of IML-BDR.

13.
IEEE Trans Cybern ; 51(10): 5156-5169, 2021 Oct.
Article in English | MEDLINE | ID: mdl-31545755

ABSTRACT

Multi-instance learning (MIL) has been extensively applied to various real tasks involving objects with bags of instances, such as drugs and images. Previous studies on MIL assume that data are entirely complete. However, in many real tasks, the instances are fragmentary. In this article, we present probably the first study on multi-instance classification with fragmentary data. In our proposed framework, called fragmentary multi-instance classification (FIC), the fragmentary data are completed and the multi-instance classifier is learned jointly. To facilitate the integration between completion and classifier learning, FIC establishes a weighting mechanism to measure the importance levels of different instances. To validate the compatibility of our framework, four typical MIL methods, including multi-instance support vector machine (MI-SVM), expectation maximization diverse density (EM-DD), citation-K nearest neighbors (Citation-KNN), and MIL with discriminative bag mapping (MILDM), are embedded into the framework to obtain the corresponding FIC versions. As an illustration, an efficient solving algorithm is developed to address the problem for MI-SVM, together with a proof of its convergence behavior. The experimental results on various types of real-world datasets demonstrate the effectiveness of the framework.

14.
IEEE Trans Cybern ; 50(5): 2124-2137, 2020 May.
Article in English | MEDLINE | ID: mdl-30530346

ABSTRACT

Different views of multiview data share certain common information (consensus) and also contain some complementary information (complementarity). Both consensus and complementarity are of significant importance to the success of multiview learning. In this paper, we explicitly formulate both of them for multiview classification. On the one hand, a cohesion-increasing loss term with a learnable label-adjusting matrix is designed to facilitate consensus among views in the training stage. With this kind of loss, the learned classifiers of all views are more likely to obtain the correct classification, thereby maximizing the agreement among views. On the other hand, an independence measurement is adopted as the diversity-promoting regularization to encourage classifiers to be diverse such that more complementary information can be captured by these "diversified" classifiers. Overall, the resultant model is capable of achieving more comprehensive and accurate classification by exploring and exploiting the common and complementary information across multiple views more thoroughly. An iterative optimization algorithm with proved convergence is proposed for training the model. Extensive experimental results on various datasets have demonstrated the efficacy of the proposed method.

15.
Cereb Cortex ; 30(1): 269-282, 2020 01 10.
Article in English | MEDLINE | ID: mdl-31044223

ABSTRACT

The human precuneus is involved in many high-level cognitive functions, which strongly suggests the existence of biologically meaningful subdivisions. However, the functional parcellation of the precuneus remains to be thoroughly investigated. In this study, we developed an eigen clustering (EIC) approach for the parcellation using precuneus-cortical functional connectivity from fMRI data of the Human Connectome Project. The EIC approach is robust to noise and can automatically determine the cluster number. It is consistently demonstrated that the human precuneus can be subdivided into six symmetrical and connected parcels. The anterior and posterior precuneus participate in sensorimotor and visual functions, respectively. The central precuneus, with four subregions, appears to play a mediating role in the interaction of the default mode, dorsal attention, and frontoparietal control networks. The EIC-based functional parcellation is free of the spatial distance constraint and is more functionally coherent than parcellation using typical clustering algorithms. The precuneus subregions showed high accordance with cortical morphology and revealed good functional segregation and integration characteristics in functional task-evoked activations. This study may shed new light on human precuneus function at a fine-grained level and offer an alternative scheme for human brain parcellation.


Subject(s)
Connectome/methods , Parietal Lobe/anatomy & histology , Parietal Lobe/physiology , Adult , Cluster Analysis , Female , Humans , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging , Male , Neural Pathways/anatomy & histology , Neural Pathways/physiology , Young Adult
16.
Front Hum Neurosci ; 13: 219, 2019.
Article in English | MEDLINE | ID: mdl-31316362

ABSTRACT

[This corrects the article DOI: 10.3389/fnhum.2019.00029.].

17.
Front Hum Neurosci ; 13: 29, 2019.
Article in English | MEDLINE | ID: mdl-30792634

ABSTRACT

Differences between males and females exist widely in cognition, behavior and psychopathology, while the underlying neurobiology is still unclear. As brain structure is the foundation of its function, insight into brain structure may help us to better understand the functional mechanism of gender difference. Previous structural studies of gender difference in Magnetic Resonance Imaging (MRI) usually focused on gray matter (GM) concentration and structural connectivity (SC), leaving cortical morphology not properly characterized. In this study a large dataset is used to explore whether cortical three-dimensional (3-D) morphology can offer enough discriminative morphological features to effectively identify gender. Data of all available healthy controls (N = 1113) from the Human Connectome Project (HCP) were utilized. We proposed a multivariate pattern analysis method called Hierarchical Sparse Representation Classifier (HSRC) and obtained an accuracy of 96.77% for gender identification. Permutation tests were used to test the reliability of gender discrimination (p < 0.001). Cortical 3-D morphological features within the frontal lobe were found to be the most important contributors to gender difference in human brain morphology. Moreover, we investigated the gender discriminative ability of cortical 3-D morphology in predefined Anatomical Automatic Labeling (AAL) and Resting-State Networks (RSN) templates, and found the superior frontal gyrus to be the most discriminative in AAL and the default mode network the most discriminative in RSN. Gender difference in surface-based morphology was also discussed. The frontal lobe, as well as the default mode network, has been widely reported to show gender difference in previous structural and functional MRI studies, which suggests that morphology indeed affects human brain function. Our study indicates that gender can be identified at the individual level by using cortical 3-D morphology and offers a new approach for structural MRI research, as well as highlighting the importance of gender balance in brain imaging studies.
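The permutation test mentioned in the abstract can be illustrated with a generic sketch: the class labels are shuffled many times, the classifier is re-scored on each shuffle, and the p-value is the share of shuffles that score at least as well as the true labels. Everything below is an assumption for illustration: the toy one-dimensional data, the midpoint-threshold classifier `threshold_accuracy`, and the use of resubstitution accuracy stand in for HSRC and the study's actual evaluation protocol, which the abstract does not detail.

```python
import random
from typing import Callable, List, Sequence


def threshold_accuracy(x: Sequence[float], y: Sequence[int]) -> float:
    """Accuracy (on the same data) of a toy classifier that splits
    at the midpoint between the two class means."""
    mean0 = sum(v for v, c in zip(x, y) if c == 0) / list(y).count(0)
    mean1 = sum(v for v, c in zip(x, y) if c == 1) / list(y).count(1)
    midpoint = (mean0 + mean1) / 2.0
    # The class with the larger mean is predicted above the midpoint.
    high_class = 1 if mean1 >= mean0 else 0
    predictions = [high_class if v > midpoint else 1 - high_class for v in x]
    return sum(p == c for p, c in zip(predictions, y)) / len(y)


def permutation_p_value(
    x: Sequence[float],
    y: Sequence[int],
    score: Callable[[Sequence[float], Sequence[int]], float],
    n_permutations: int,
    seed: int,
) -> float:
    """Share of label shuffles scoring at least as well as the true labels."""
    rng = random.Random(seed)
    observed = score(x, y)
    shuffled: List[int] = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score(x, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data: two perfectly separated groups of ten samples each.
x = [float(i) for i in range(20)]
y = [0] * 10 + [1] * 10
p_value = permutation_p_value(x, y, threshold_accuracy, n_permutations=200, seed=0)
```

With the add-one correction, the smallest reportable p-value is 1 / (n_permutations + 1), so reporting p < 0.001 requires at least 1000 permutations.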

18.
IEEE Trans Pattern Anal Mach Intell ; 41(9): 2176-2192, 2019 09.
Article in English | MEDLINE | ID: mdl-29994111

ABSTRACT

With the evolution of data collection methods, it is possible to produce abundant data described by multiple feature sets. Previous studies show that including more features does not necessarily bring positive effects. How to prevent the augmented features from worsening classification performance is crucial but rarely studied. In this paper, we study this challenging problem by proposing a safe classification approach, whose accuracy never degrades when exploiting augmented features. We propose two ways to achieve the safeness of our method, named SAfe Classification (SAC). First, to leverage augmented features, we learn various types of classifiers and adapt them by employing a specially designed robust loss. This provides various candidate classifiers to meet the assumption of the safeness operation. Second, we search for a safe prediction by integrating all candidate classifiers. Under a mild assumption, the integrated classifier has a theoretical safeness guarantee. Several new optimization methods with proved convergence have been developed to solve the resulting problems. Besides evaluating SAC on 16 data sets, we also apply SAC to the diagnostic classification of schizophrenia, given its broad application potential. Experimental results demonstrate the effectiveness of SAC in both tackling the safeness problem and discriminating schizophrenic patients from healthy controls.

19.
IEEE Trans Cybern ; 49(8): 3006-3019, 2019 Aug.
Article in English | MEDLINE | ID: mdl-29994238

ABSTRACT

Data recycling, which reuses historical data to help present data achieve better performance, is an emerging and important research topic. A common case is that historical examples only have features from one source, while at present there are more data collection methods and different types of features are extracted simultaneously for new examples. Previous studies assume that either historical data appear in all sources, or at least one type of representation exists for all data. In this paper, we study the challenging problem in the above common case and propose a novel semisupervised approach that leverages nonoverlapping historical features (NHFs). It learns full representations of both historical features and present features in a latent subspace. We utilize the intrinsic geometrical structure of all data and add the label information of historical data as a hard constraint to discover a latent subspace. Then, classification is performed with these new representations. Moreover, we provide an efficient algorithm with proved convergence behavior to solve the formulated optimization problem, together with some insightful discussions about parameter determination. Experimental results on real-world data sets are provided to examine the effectiveness of our algorithm. Furthermore, we have also evaluated our method in face recognition. The results all demonstrate the effectiveness of our proposed approach in recycling NHFs.

20.
IEEE Trans Cybern ; 49(3): 933-946, 2019 Mar.
Article in English | MEDLINE | ID: mdl-29994361

ABSTRACT

High-dimensional non-Gaussian data are ubiquitous in many real applications. Face recognition is a typical example of such scenarios. In the original data space, the sampled face images of different persons are often located more closely to each other than to those of the same individual, due to changes in conditions such as illumination, pose, and facial expression. Such data are often non-Gaussian, and differentiating the importance of each data point has been recognized as an effective approach to processing high-dimensional non-Gaussian data. In this paper, to embed non-Gaussian data well, we propose a novel framework named adaptive discriminative analysis (ADA), which unifies the measurement of each sample's importance with subspace learning. ADA can therefore preserve the within-class local structure and learn the discriminative transformation functions simultaneously, by minimizing the distances between projected samples within the same class while maximizing the between-class separability. Meanwhile, an efficient method is developed to solve our formulated problem. Comprehensive analyses, including convergence behavior and parameter determination, together with the relationship to other related approaches, are also presented. Systematic experiments are conducted to understand how the proposed ADA works. Promising experimental results on various types of real-world benchmark data sets are provided to examine the effectiveness of our algorithm. Furthermore, we have also evaluated our method in face recognition. The results all validate the effectiveness of our method in processing high-dimensional non-Gaussian data.
